Abstract

Background: Body size is intimately related to the physiology and ecology of an organism. Therefore, accurate 
and consistent body mass estimates are essential for inferring numerous aspects of paleobiology in extinct taxa, 
and investigating large-scale evolutionary and ecological patterns in the history of life. Scaling relationships 
between skeletal measurements and body mass in birds and mammals are commonly used to predict body mass 
in extinct members of these crown clades, but the applicability of these models for predicting mass in more 
distantly related stem taxa, such as non-avian dinosaurs and non-mammalian synapsids, has been criticized on 
biomechanical grounds. Here we test the major criticisms of scaling methods for estimating body mass using an 
extensive dataset of mammalian and non-avian reptilian species derived from individual skeletons with live 
weights. 

Results: Significant differences in the limb scaling of mammals and reptiles are noted in comparisons of limb 
proportions and limb length to body mass. Remarkably, however, the relationship between proximal (stylopodial) 
limb bone circumference and body mass is highly conserved in extant terrestrial mammals and reptiles, in spite of 
their disparate limb postures, gaits, and phylogenetic histories. As a result, we are able to conclusively reject the 
main criticisms of scaling methods that question the applicability of a universal scaling equation for estimating 
body mass in distantly related taxa. 

Conclusions: The conserved nature of the relationship between stylopodial circumference and body mass 
suggests that the minimum diaphyseal circumference of the major weight-bearing bones is only weakly influenced 
by the varied forces exerted on the limbs (that is, compression or torsion) and most strongly related to the mass of 
the animal. Our results, therefore, provide a much-needed, robust, phylogenetically corrected framework for 
accurate and consistent estimation of body mass in extinct terrestrial quadrupeds, which is important for a wide 
range of paleobiological studies (including growth rates, metabolism, and energetics) and meta-analyses of body 
size evolution. 



Background 

In extant taxa, body size is recognized as one of the 
most important biological properties because it strongly 
correlates with numerous physiological and ecological 
factors, such as metabolic rate [1-3], growth rate [4,5], 
fecundity [6], diversity [7], and population density [8,9], 
as well as home range and land area [6,10,11], which are 
related to the productivity of the host environment [12]. 
Due to these relationships, estimates of body mass (the
standard measure of body size) are essential for inferring 
the paleobiology of extinct taxa, and investigating large- 
scale evolutionary and ecological patterns in the history 
of life. 

Due to the biological implications of body size, it is not 
surprising that numerous paleontological studies have 
used body mass estimates to reconstruct and interpret: 
patterns of body size evolution [13-22], brain-size allome- 
try and evolution [23-26], the evolution of reproduction 
[27-29], growth rates [30,31], postural allometry and loco- 
motion [14,32,33], metabolism [34-36], paleotemperature 
[37], visceral organ size [38], and community and trophic
structures [10,39,40]. In order to infer these biological 
properties, studies require the use of an estimate or proxy 
of body size, which can have a large effect on the final 
interpretation. As a result, it is important to understand 
the set of assumptions/errors incurred by body size esti- 
mates and proxies. 

Currently, there are two types of methods used to esti- 
mate body mass in extinct animals: volumetric reconstruc- 
tions and skeletal scaling relationships. The latter method 
is commonly used to predict body mass in extinct mem- 
bers of relatively recent crown clades (that is, of Mesozoic 
origin) such as Mammalia and Aves [21,41-45]. However, 
in stem groups (for example, non-avian dinosaurs and 
non-mammalian synapsids), estimations are often based 
on volumetric reconstructions, which involve physical 
three-dimensional scale models [46,47], graphic double 
integration of two-dimensional reconstructions [48-50], or 
computer-generated life reconstructions [51-55]. Such 
estimates are widely used in the literature (for example, 
[35,38]) despite the fact that they are prone to a considerable
amount of error. In a typical example, two body mass estimates
for a single mounted skeleton of Brachiosaurus
brancai, recently published by the same research group,
differ almost twofold: 38 tonnes and 74.4 tonnes
[54,56]. Such differences in estimates are the result of differing
interpretations of a multitude of factors associated
with the mass and proportion of an organism's tissues and 
organs [57], or, perhaps most importantly, the effects of 
air sacs and lungs, which will likely have a large effect on 
specific gravity (the total body density of the animal in 
relation to water), needed to estimate mass from a volume. 
Within non-avian reptiles specific gravity has been noted 
to range from 0.8 to 1.2 [46,48]; however, given the vary- 
ing levels of bone pneumaticity observed in saurischian 
dinosaurs [58,59], and the fact that birds typically exhibit 
lower densities than mammals and other reptiles [60], it is 
almost certain that the specific gravity of extinct animals 
also varied [59]. As a result, assumptions based on a set 
density parameter will considerably affect a mass estimate 
[54,56]. Perhaps more importantly, the numerous assump- 
tions about soft tissue properties and body shape (for 
example, muscle sizes) in many of the models make it dif- 
ficult to control for sources of error and to determine the 
confidence associated with a given mass estimate, although 
recent computational modelling advances attempt to out- 
line maximum and minimum body mass bounds (for 
example, [54,61,62]). Despite the complications associated 
with life reconstructions of extinct taxa, models are impor- 
tant for testing numerous biomechanical hypotheses 
[61,63-68]. Therefore, it is important that models be con- 
strained by data derived from extant taxa, such as those 
obtained from scaling relationships. 
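
The sensitivity of a volumetric estimate to the assumed density can be illustrated with a short calculation. The sketch below converts one reconstructed volume to mass across the specific gravity range of 0.8 to 1.2 noted above. The volume of 40 cubic metres is a hypothetical value chosen only for illustration, and the density of water is taken as 1,000 kg per cubic metre.

```python
WATER_DENSITY_KG_PER_M3 = 1000.0

def mass_from_volume(volume_m3, specific_gravity):
    """Body mass in kg from a reconstructed volume and an assumed specific gravity."""
    return volume_m3 * specific_gravity * WATER_DENSITY_KG_PER_M3

# Hypothetical volume, not taken from any published reconstruction.
volume_m3 = 40.0
low = mass_from_volume(volume_m3, 0.8)
high = mass_from_volume(volume_m3, 1.2)

# Spread of the estimate relative to the mass at neutral density (specific gravity 1.0).
relative_spread = (high - low) / mass_from_volume(volume_m3, 1.0)
print(low, high, relative_spread)
```

Under these assumptions the choice of specific gravity alone moves the estimate by 40% of the neutral-density mass, before any uncertainty in the reconstructed volume is considered.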



An alternative method to reconstructions, and one that 
can be used to test and constrain scale and computational 
models [55], is the use of scaling relationships between
body mass and skeletal dimensions derived from extant 
taxa. A skeletal measure, if strongly related to body mass, 
will provide an estimate that controls for the sources of 
error associated with making a reconstruction, such as 
determination of tissue volume and specific gravity, which 
are virtually impossible to constrain in life-reconstructions. 
Furthermore, skeletal measurements are generally easier to 
obtain than full body scale reconstructions, especially for 
taxa that are only partially preserved, and are therefore 
more practical estimators in large-scale evolutionary and 
ecological studies (for example, [15-17,20]). Finally, the 
variation in the extant dataset can be used to quantify the 
degree of confidence in the estimated parameter, and can 
thus provide a range in which a particular body mass is 
likely to fall, thereby providing a constraint for estimates 
produced by reconstructed models. Scaling methods are 
almost universally accepted as a means to estimate body 
mass accurately for extinct taxa of crown groups, such as 
mammals and birds (for example, [17,42]), but have been 
extensively criticized when applied to more distantly 
related stem taxa that fall outside the body size range 
observable in extant representatives, such as Indricother- 
ium [69], xenarthrans [43], and non-avian dinosaurs 
[70-72]. For the first two groups, studies have since shown 
that scaling relationships still provide the most reliable 
mass estimates [43,69]. 

Dinosaurian body masses are still generally estimated 
using reconstructions, with the exception of two studies 
[45,73]. The pioneering work completed by Anderson 
et al. [73], herein referred to as the Anderson method, 
suggested that the body mass of dinosaurs could be esti- 
mated using the measured scaling relationship between 
live mass and total circumference of the stylopodia 
(humerus + femur) derived from a sample of 33 species of 
extant terrestrial mammals. Although the Anderson 
method provides a more objective way to estimate body 
mass in extinct taxa, it has been criticized by numerous 
authors (for example, [49,56,61,70,71,74-76]). Here we use 
an extensive dataset of extant mammals and non-avian 
reptiles compiled from individual skeletons of live-weighed 
animals, in order to directly test the three main criticisms 
made towards the use of a universal limb scaling relation- 
ship to estimate body mass in extinct terrestrial amniotes: 

1. The Anderson method, widely cited especially among
non-avian dinosaur researchers, is criticized for its
use of a sample that is taxonomically biased towards ungulates
(for example, [70]). Studies examining limb-scaling patterns
in mammals have noted that the limb proportions of
ungulates differ from those of other mammals [70,77,78]. 
However, whether ungulates differ from other groups of 



Campione and Evans BMC Biology 2012, 10:60 
http://www.biomedcentral.com/1741-7007/10/60






mammals in their scaling patterns of limb circumference 
to body mass has not been directly tested. 

2. Differences in gait and limb posture impart different 
stress regimes on the limbs [79,80]. These differences 
may affect limb morphology, thereby negating the 
applicability of a single equation to estimate body mass 
in a variety of extinct vertebrates. Given different stress 
regimes, we test for differential limb scaling between 
animals of various gaits and limb posture by comparing 
differently sized sub-samples of mammals, and parasa- 
gittal mammals to sprawling reptiles. 

3. Residual outliers (large residual values) and extreme 
outliers (values at the upper and lower extremes of the 
dataset) can have a large effect on regression coefficients 
[81]. The problem of residual outliers in the large-bodied 
mammalian sample of Anderson et al. [73] was discussed 
by Packard et al. [82]. We have expanded the sample size 
of the large-bodied dataset and will address the effect that 
potential residual outliers have on the circumference to 
body mass relationship. The effect of extreme outliers on 
limb scaling is, in part, mediated by logarithmic transfor- 
mation of the data, but will also be assessed through size 
class comparisons. Although the issue of body mass extra- 
polation to giant extinct taxa (for example, Sauropoda; 
[50,72]) will always exist, the vast majority of extinct ani- 
mals, including most non-avian dinosaurs, fall within the 
body mass range of extant taxa. 

All three of these criticisms are tested for the first time, 
within the context of 200 mammal and 47 non-avian rep- 
tile species [See Additional file 1, Dataset]. Based on our 
results we develop a universal scaling equation between 
the total circumference of the stylopodia and body mass 
that is applicable to all terrestrial quadrupeds, and permits 
estimation of body mass in extinct taxa along with an 
error factor that can constrain estimates for use in future 
paleobiological studies.
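
As a minimal sketch of how such an equation is applied, the example below uses the raw-data SMA coefficients listed in Table 1 for total stylopodial circumference against body mass in the full sample (m = 2.7779, b = -1.1564). Base-10 logarithms are an assumption, the measurement units are those of the underlying dataset and are not stated in this section, and the circumference values are hypothetical.

```python
import math

# Raw-data SMA coefficients for "C H+F vs. BM, All" as listed in Table 1.
SLOPE = 2.7779
INTERCEPT = -1.1564

def estimate_body_mass(humeral_circ, femoral_circ):
    """Predict body mass from log BM = m * log(C_H + C_F) + b (base-10 logs assumed)."""
    total_circumference = humeral_circ + femoral_circ
    log_mass = SLOPE * math.log10(total_circumference) + INTERCEPT
    return 10.0 ** log_mass

# Hypothetical circumferences summing to 100, in the dataset's (unstated) units.
estimate = estimate_body_mass(45.0, 55.0)
print(round(math.log10(estimate), 4))
```

Because the relationship is fitted in log space, a small error in the measured circumference is multiplied by the slope before back-transformation, which is why the confidence interval on the slope matters for the final estimate.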

Results 

Raw data results 

Results from the standardized major axis (SMA) analyses 
comparing clades based on the raw non-phylogenetically 
corrected data are provided in Figures 1 and 2, and Table 1; 
comparisons are summarized in Tables 2 and 3. Size class 
comparisons are presented in Figure 3 and Tables 4 and 5. 
All analyses show strong correlations with each other, and 
to body mass (that is, size) as indicated by a mean coeffi- 
cient of determination of 0.9446 ± 0.0093 for the clade 
comparisons, and 0.914 ± 0.014 for comparisons between 
size classes. 
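
For readers unfamiliar with SMA, the following sketch shows the textbook estimator: the slope is the ratio of the standard deviations of y and x, signed by the correlation, and the line passes through the two means. This is an illustration only; it is not the software used for the analyses reported here, and the data points are synthetic log-transformed values placed exactly on a line.

```python
import math

def sma_fit(x, y):
    """Standardized major axis fit; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    # SMA slope: ratio of standard deviations, carrying the sign of the correlation.
    slope = math.copysign(math.sqrt(syy / sxx), r)
    intercept = mean_y - slope * mean_x
    return slope, intercept, r ** 2

# Synthetic log-transformed data lying exactly on y = 2.75x - 1.1.
log_x = [1.0, 1.5, 2.0, 2.5]
log_y = [2.75 * xi - 1.1 for xi in log_x]
slope, intercept, r_squared = sma_fit(log_x, log_y)
print(slope, intercept, r_squared)
```

Unlike ordinary least squares, the SMA slope does not shrink as scatter increases, which makes it the usual choice when the aim is to describe the scaling relationship itself rather than to minimize error in one variable.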

In total, 80 pairwise comparisons are made between 
mammalian clades (Tables 1 and 2). Of these compari- 
sons, the 95% confidence intervals indicate 12 significant 
differences between scaling coefficients and 13 signifi- 
cant differences between intercepts. In comparison, the 



likelihood ratio test, the results of which are adjusted 
for multiple comparisons using the false discovery rate 
(FDR), reveals 14 significant differences between slopes, 
and a t-test of the true intercepts indicates ten signifi- 
cant differences; however, when the intercept is cor- 
rected and compared at a more biologically meaningful 
value, the minimum value along the x-axis, the t-test 
indicates that there are no significant differences in 
intercept. 
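
The FDR adjustment can be sketched as follows. The text specifies only that the false discovery rate was used; the Benjamini-Hochberg step-up procedure shown here is the most common FDR method and is assumed for illustration. The P-values in the example are hypothetical.

```python
def fdr_adjust(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down so each adjusted value is the
    # minimum of p * n / rank over all equal or larger ranks.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw P-values from four pairwise comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = fdr_adjust(raw)
print(adjusted)
```

With many pairwise comparisons, such as the 80 made between mammalian clades, an adjustment of this kind keeps the expected proportion of false positives among the reported differences at the chosen level.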

Regardless of the comparison method used, the most 
significant variation is noted in the scaling of stylopodial 
proportions (length to circumference) of the humerus and 
femur, as well as in the scaling of humeral and femoral 
lengths with body mass (Figure 1; Tables 1 and 2). This is 
especially true for ungulates, which possess stylopodial 
proportions and lengths that scale significantly differently
from all other groups examined here. No significant differ- 
ences in scaling coefficients were recovered in the scaling 
of either the humeral or femoral circumference to body 
mass using the likelihood ratio test, and only two differ- 
ences were recovered by the 95% confidence interval com- 
parisons in the scaling of humerus circumference to body 
mass (Marsupialia scales significantly higher than Ungu- 
lata and Carnivora). 

In total, 13 and ten significant differences were noted
in comparisons between intercepts using confidence 
intervals and a t-test, respectively, including a significant 
difference in the intercept of Carnivora and Glires using 
95% confidence intervals in the comparison of total sty- 
lopodial (humerus + femur) circumference and body 
mass. However, visual inspection reveals major overlap 
between the data points at the minimum values along 
the x-axis (Figure 1) suggesting that significant differ- 
ences may be due to extrapolation of the SMA line to a 
value of x = 0. This is likely a valid interpretation as an 
adjusted t-test comparing the intercepts at the minimum 
values along the x-axis (Table 2) indicates that inter- 
cepts are not significantly different between mammalian 
groups in any of the comparisons made here. 
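
The effect of where the intercept is evaluated can be shown with a small example. It assumes that the adjusted intercept b' is simply the value of the fitted line at the smallest observed x value, which is one reading of the description above; the two lines and the value of x_min are hypothetical.

```python
def intercept_at(slope, intercept, x_ref):
    """Value of the fitted line y = m*x + b at x_ref, used as the adjusted intercept b'."""
    return slope * x_ref + intercept

# Two hypothetical SMA lines, each given as (slope m, intercept b).
line_a = (3.0, -1.5)
line_b = (2.5, -1.0)
x_min = 1.0  # hypothetical smallest observed log x value

# Difference between the lines at x = 0 versus at the start of the data.
gap_at_zero = abs(line_a[1] - line_b[1])
gap_at_x_min = abs(intercept_at(*line_a, x_min) - intercept_at(*line_b, x_min))
print(gap_at_zero, gap_at_x_min)
```

In this constructed case the lines differ by 0.5 log units at x = 0 but coincide at x_min, showing how an apparent intercept difference can arise purely from extrapolating beyond the data.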

Mammalian and reptilian scaling patterns show similar 
scaling coefficients, overall. Of the eight comparisons, 
two scaling coefficients showed significant differences 
using both the 95% confidence intervals and the likeli- 
hood ratio test. More specifically, the humeral propor- 
tions and humeral length to body mass in reptiles scale 
above that observed for mammals (Figure 2; Tables 1 
and 3). Comparison of the confidence intervals revealed 
significant differences in the intercepts of mammals and 
reptiles in the relationship between femur circumference 
and body mass, as well as humerus length to body mass. 
However, these differences were not recovered by either 
t-test. When the circumference of the humerus and 
femur is combined, all tests indicate that the total stylo- 
podial circumference to body mass relationship of 

Figure 1 Limb scaling patterns in mammalian clades. Lines are fitted based on the SMA results presented in Table 1. (A) Log femoral length
and circumference plotted against log body mass. (B) Log humeral length and circumference against log body mass. (C) Log femoral length 
plotted against log humeral length. (D) The log of combined humeral and femoral circumference against log body mass. SMA, standardized 
major axis. 



reptiles is statistically indistinguishable from that of
mammals. 

Finally, in order to ensure that the results obtained for
mammals and reptiles are not influenced by differences 
in body size range in the two samples, we re-ran the ana- 
lyses using a subset of the mammalian dataset (N = 174), 
which corresponds to all mammals equal to, or below, 
the mass of the Alligator mississippiensis specimen (168 
kg), the largest reptile measured in this study. In general, 
results of this pruned analysis were similar to those 



obtained with the entire mammalian dataset (Table 3) 
[See Additional file 2, Table S1]. In particular, comparisons
of slopes based on the likelihood ratio test are identical.
Differences between the two analyses were noted in
comparisons using the 95% confidence intervals in which 
the pruned analysis revealed an additional difference in 
the scaling of femoral length and circumference between 
mammals and reptiles, but failed to recover a significant 
difference between intercepts in the scaling of femoral 
circumference to body mass. The t-test on the pruned 

Figure 2 Limb scaling patterns in quadrupedal terrestrial tetrapods. Lines are fitted based on the SMA results presented in Table 1.

data also revealed an additional difference between the
intercepts of mammals and reptiles in the relationship of 
humeral length to body mass as well as femoral to hum- 
eral length. Despite differences in the scaling of stylopo- 
dial length, no significant differences were noted in the 
scaling of stylopodial circumference to body mass 
between mammals and reptiles. 



Size class comparisons, based on the mammalian dataset 
(N = 200), at three different thresholds reveal greater varia- 
tion in scaling patterns between subsamples at lower body 
size thresholds (Tables 4 and 5), although this may be due 
to the small sample size in the large body size class at the 
100 kg threshold (N = 36). In particular, the limb propor- 
tions of the humerus scaled differently in animals smaller 

Table 1 Stylopodial scaling in mammals and non-avian reptiles.

Analysis (x vs. y)  Sample       N    m       m 95% CI          b        b 95% CI            R²
…                   …            …    …       …                 …        -0.1248 to -0.2166  0.9771
…                   Reptilia     46   0.9644  1.0644 to 0.8739  0.0190   0.176 to -0.1379    0.8943
…                   Ungulata     32   1.0260  1.107 to 0.9509   -0.1452  0.0491 to -0.3395   0.9584
…                   Carnivora    46   0.9741  1.0283 to 0.9227  0.0232   0.1338 to -0.0874   0.9682
…                   Marsupialia  14   0.9372  1.1182 to 0.7856  -0.0224  0.2894 to -0.3344   0.9204
…                   Euarchonta   14   1.2075  …                 -0.5127  …                   0.9307
…                   Glires       66   1.0625  1.1019 to 1.0246  -0.2177  -0.1536 to -0.2818  0.9788
C H+F vs. BM        All          247  2.7779  2.8191 to 2.7374  -1.1564  -1.086 to -1.2267   0.9863
C H+F vs. BM        Mammalia     200  2.8071  2.8495 to 2.7654  -1.2289  -1.1541 to -1.3037  0.9886
C H+F vs. BM        Reptilia     47   2.7933  2.9496 to 2.6452  -1.0833  -0.8636 to -1.3031  0.9671
C H+F vs. BM        Ungulata     41   2.7319  2.8959 to 2.5773  -1.0660  -0.6989 to -1.4331  0.9676
C H+F vs. BM        Carnivora    48   2.6921  2.7969 to 2.5911  -0.9568  -0.7669 to -1.1466  0.9834
C H+F vs. BM        Marsupialia  14   2.9125  3.1855 to 2.6628  -1.3738  -0.9658 to -1.7817  0.9797
C H+F vs. BM        Euarchonta   15   2.8692  3.0561 to 2.6937  -1.3928  -1.0911 to -1.6946  0.9889
C H+F vs. BM        Glires       66   2.8850  3.0113 to 2.764   -1.3260  -1.1561 to -1.4960  0.9705

Standardized Major Axis equation shown in the format y = mx + b. The particular theoretical scaling model (Sim.) followed by the slope is represented by G,
geometric similarity, E, elastic similarity, or S, static similarity. Scaling patterns that fall between models are represented by > or <, and those that do not follow
any pattern (that is, above or below all predicted models) are represented by a 0. BM, body mass; C F, femoral circumference; C H, humeral circumference; C H+F,
total humeral and femoral circumference; CI, confidence interval; L F, femoral length; L H, humeral length.



than 20 kg compared to those larger than 20 kg, a pattern 
also noted at the 50 kg threshold. A significant difference 
in the proportional scaling of the femur is also noted at 
50 kg. Significant differences were noted in the scaling of 
humeral length to body mass between individuals at the 
20 kg and 50 kg thresholds. As in the mammalian and reptilian
comparisons, no significant differences were noted in
the scaling of combined circumference and body mass 
between different size classes (Figure 3; Table 5). 

Independent contrast results 

Overall, phylogenetically corrected scaling relationships 
reveal lower coefficients of determination than the raw 
data. The mean R² (0.9126 ± 0.0105) for the corrected
data is significantly lower than that obtained from the
raw data (two-tailed t-test: t = -4.4721; P << 0.0001). As
a result, fewer significant differences were noted between 
mammalian clades and between mammals and reptiles 
[See Additional file 3, Tables S1 and S2]. Of the 80 mammalian
comparisons made, two showed significant differences
recovered by both the 95% confidence intervals
and the likelihood ratio test. The differences include a 
significantly lower scaling coefficient of Carnivora com- 
pared to Glires and Ungulata, in the scaling of femur 
length to humerus length. Confidence intervals indicate
two other differences in the corrected data:
the humeral length of reptiles scales significantly higher
than that of mammals when compared to body mass, and
the humeral circumference in ungulates scales higher
than that of carnivorans when compared to body mass.

Most importantly, however, based on the confidence 
intervals, comparisons between scaling coefficients 
obtained from the raw data (Table 1) and the phylogen- 
etically corrected data [See Additional file 3, Table S2] 



reveal only a single significant difference for the scaling 
of humeral proportions in Glires. Other than that com- 
parison, the lack of significant differences between the 
raw data and phylogenetically corrected data suggests
that phylogeny does not play a significant role in dictat- 
ing the scaling patterns tested here with regards to the 
major weight-bearing bones in terrestrial tetrapods. For 
this reason, and for ease of comparison with previous 
limb scaling studies, further discussion will be based on 
results obtained from the raw data. 

Discussion 

Skeletal limb morphology in vertebrates is considered to 
reflect a trade-off between the energetic requirements 
imposed by movement and the functional requirements 
imposed by loadings on the bone from behavioral qualities 
and/or body size [78,83-88]. Biomechanical studies using 
in vivo strain gauges and force platforms in mammals and 
birds have concluded that peak functional strains, and therefore
safety factors (the strain at which yield or failure occurs divided by the peak
functional strain), experienced by a limb bone during locomotion
are consistent among taxa of different size and different
lifestyles (for example, terrestrial, aquatic, and aerial;
[80]). However, in non-avian reptiles, safety factors are 
higher compared to mammals suggesting that functional 
strains are lower in the former [79,89,90]. Nevertheless, in
order to mitigate decreases in safety factors associated 
with increases in body size, the architecture of the skeletal 
limb, such as limb robustness, cortical thickness, and/or 
curvature, is expected to vary [80,86,88,91].

Interspecific limb scaling patterns are often used to test 
theoretical biomechanical models, such as geometric, elas- 
tic, and static similarity, which predict scaling patterns 
based on biomechanical observations and/or assumptions 

Table 2 Slope and intercept comparisons of stylopodial scaling patterns in mammalian clades.

Columns: m 95% CI, m LRT, b 95% CI, b t-test, and b' t-test, each compared against Marsupialia, Glires, Euarchonta, and Ungulata.
Rows: Carnivora, Ungulata, Euarchonta, and Glires within each analysis (L F vs. C F, L H vs. C H, L F vs. BM, C F vs. BM, L H vs. BM, C H vs. BM, L F vs. L H, and C H+F vs. BM).

Standardized major axis equation shown in the format y = mx + b. Symbols: (°) represents differences at 90 to 95% (0.1 > P > 0.05); (*) at 95 to 99% (0.05 > P > 0.01); and
(**) at greater than 99% (P < 0.01). Otherwise, P-values are > 0.1. All P-values are adjusted for multiple comparisons using FDR. Hyphens (-) represent duplicate
comparisons. Significant differences using 95% CI are assessed on whether the intervals overlap or not; non-overlapping comparisons are indicated with an asterisk (*).
95% CI, comparisons based on 95% confidence intervals; b', intercept adjusted to correspond to the minimum value along the x-axis; BM, body mass; C F, femoral
circumference; C H, humeral circumference; C H+F, total humeral and femoral circumference; FDR, false discovery rate; L F, femoral length; L H, humeral length; LRT,
comparisons based on a likelihood ratio test (slope only); t-test, comparisons based on a two-tailed t-test (intercept only).

Table 3 Slope and intercept comparisons of stylopodial scaling patterns in mammals and non-avian reptiles.

Columns: mCI, mP, bCI, bP, and b'P, given for All Data and for Mammals < 168 kg (a).
Rows: L F vs. C F, L H vs. C H, L F vs. BM, C F vs. BM, L H vs. BM, C H vs. BM, L F vs. L H, and C H+F vs. BM.

Standardized major axis equation shown in the format y = mx + b. Symbols: (°)
represents differences at 90 to 95% (0.1 > P > 0.05); (*) at 95 to 99% (0.05 > P >
0.01); and (**) at greater than 99% (P < 0.01). Otherwise, P-values are > 0.1. All
P-values are adjusted for multiple comparisons using FDR. Significant
differences using 95% CI are assessed on whether the intervals overlap or not;
non-overlapping comparisons are indicated with an asterisk (*). (a) Comparisons
based on a subset of the mammalian dataset that has the same body mass
range as the total reptilian dataset [See Additional file 2, Table S1]; mCI, slope
comparisons based on 95% confidence intervals; mP, slope comparisons based
on likelihood ratio test; bCI, intercept comparisons based on 95% confidence
intervals; bP, intercept comparisons based on two-tailed t-test; b'P, t-test
comparison of adjusted intercepts to the minimum value along the x-axis; BM,
body mass; C F, femoral circumference; FDR, false discovery rate; L H, humeral
length; C H, humeral circumference; C H+F, total humeral and femoral
circumference; L F, femoral length.

[70,77,78,84,85,92-94]. These theoretical models were for- 
mally presented by McMahon [95,96], who provided 
empirical support for elastic scaling in terrestrial verte- 
brates (using ungulates as a proxy), as opposed to a strict 
geometric (isometric) scaling. These models were subsequently
revisited by other authors who presented empirical
evidence that elastic similarity is restricted to ungulates 
with other mammals following either a geometric trend 
[77] or not clearly conforming to either the elastic or geo- 
metric theoretical models [85,93,94]. In general, empirical 
scaling studies of terrestrial mammals have found minor 
support for elastic similarity (see [87], for a full review). In 
reptiles, however, Blob [84] recovered significant support 
for elastic similarity in several regressions comparing limb 
diameters to body mass in varanids and iguanians. 
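
The comparison of an empirical slope with these theoretical models can be sketched as a simple interval check. The predicted slopes below follow from the standard exponents of the three models (diameter proportional to mass to the power 1/3, 3/8, and 2/5 for geometric, elastic, and static similarity, respectively), inverted so that body mass is the dependent variable. That circumference scales as diameter, and that the Sim column of the tables was assigned in this way, are assumptions of this illustration.

```python
# Predicted slopes of log body mass on log circumference (circumference assumed
# proportional to diameter): geometric 3, elastic 8/3, static 5/2.
PREDICTED_SLOPES = {"G": 3.0, "E": 8.0 / 3.0, "S": 5.0 / 2.0}

def models_within_ci(upper, lower):
    """Return the similarity models whose predicted slope lies inside the 95% CI."""
    return [name for name, slope in PREDICTED_SLOPES.items() if lower <= slope <= upper]

# Confidence intervals as reported in Table 4 (upper bound listed first).
femur_small = models_within_ci(3.0735, 2.8429)    # C F vs. BM, < 20 kg
humerus_large = models_within_ci(2.7425, 2.4258)  # C H vs. BM, > 20 kg
print(femur_small, humerus_large)
```

Under these assumptions the femoral interval for mammals under 20 kg contains only the geometric prediction, whereas the humeral interval for mammals over 20 kg contains both the elastic and static predictions, matching the codes listed for those rows in Table 4.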

The results obtained here suggest that limb scaling in 
mammalian and reptilian clades exhibits a great deal of 
variation with respect to elastic and geometric similarity, 
and as suggested by Christiansen [85,93], depending on 
the variables being compared, clades and subgroups 
appear to follow a variety of scaling models, and no the- 
oretical scaling model can be used to describe all terres- 
trial vertebrates. However, this study suggests that 
elastic similarity is more prevalent than previously sug- 
gested, especially in the scaling of humeral circumfer- 
ence with body mass. Of the eight clades examined 
(Table 1), only a single group, Marsupialia, did not fol- 
low a significant allometric trend (that is, significantly 
different than geometric similarity), and six of the clades 





Figure 3 Limb scaling patterns in different mammalian size 
classes. Lines are fitted based on the SMA results presented in 
Table 4. All three comparisons plot the log total stylopodia 
circumference against log body mass in the mammalian sample of 
the dataset. Size class comparisons are based on previously studied 
thresholds discussed in the text [78,93,94]. Mammals above and 
below 20 kg (A), 50 kg (B), and 100 kg (C). SMA, standardized major 
axis. 

Table 4 Stylopodial scaling in mammals of different sizes.

Analysis (x vs. y)  Sample    N    m       m 95% CI          b        b 95% CI            R²      Sim
L F vs. C F         < 20 kg   136  0.8868  0.9335 to 0.8424  -0.3733  -0.2921 to -0.4545  0.9095  0
                    > 20 kg   52   1.0000  1.1370 to 0.8795  -0.4907  -0.1714 to -0.8100  0.7945  G
                    < 50 kg   150  0.9486  0.9987 to 0.9009  -0.4715  -0.3819 to -0.5611  0.8993  G
                    > 50 kg   38   1.1331  1.2731 to 1.0084  -0.8317  -0.4935 to -1.1699  0.8806  > G, < E
                    < 100 kg  158  0.9659  1.0123 to 0.9216  -0.5000  -0.4155 to -0.5845  0.9119  G
                    > 100 kg  30   1.1059  1.2659 to 0.9661  -0.7572  -0.3679 to -1.1465  0.8774  G
L H vs. C H         < 20 kg   135  0.8778  0.9248 to 0.8331  -0.3345  -0.2567 to -0.4124  0.9073  0
                    > 20 kg   52   1.1541  1.2900 to 1.0326  -0.7954  -0.4848 to -1.1060  0.8459  > G, < E
                    < 50 kg   149  0.9040  0.9990 to 0.9040  -0.4445  -0.3613 to -0.5277  0.9060  0
                    > 50 kg   38   1.1856  1.3524 to 1.0394  -0.8774  -0.4879 to -1.2668  0.8475  > G, < E
                    < 100 kg  157  0.9710  1.0166 to 0.9274  -0.4764  -0.3967 to -0.5560  0.9161  G
                    > 100 kg  30   1.1229  1.3132 to 0.9602  -0.7114  -0.2646 to -1.1582  0.8352  G
L F vs. BM          < 20 kg   136  2.6288  2.7825 to 2.4836  -1.7421  -1.4756 to -2.0086  0.8892  0
                    > 20 kg   52   2.8571  3.2511 to 2.5108  -1.8964  -0.9788 to -2.8141  0.7920  G
                    < 50 kg   150  2.7619  2.9166 to 2.6754  -1.9510  -1.6751 to -2.2270  0.8873  0
                    > 50 kg   38   3.1523  3.5377 to 2.8089  -2.6399  -1.7089 to -3.5709  0.8831  G
                    < 100 kg  158  2.8104  2.9526 to 2.6750  -2.0305  -1.7717 to -2.2893  0.9025  0
                    > 100 kg  30   3.0497  3.5022 to 2.6557  -2.3587  -1.2593 to -3.4582  0.8715  G
C F vs. BM          < 20 kg   138  2.9559  3.0735 to 2.8429  -0.6266  -0.4858 to -0.7675  0.9471  G
                    > 20 kg   62   2.8638  3.0353 to 2.7020  -0.5040  -0.1723 to -0.8358  0.9492  G
                    < 50 kg   153  2.9054  3.0013 to 2.8126  -0.5716  -0.4504 to -0.6928  0.9592  G
                    > 50 kg   47   2.7816  2.9651 to 2.6094  -0.3222  0.0436 to -0.6880   0.9546  E
                    < 100 kg  164  2.9117  2.9945 to 2.8312  -0.5784  -0.4698 to -0.6870  0.9674  < G, > E
                    > 100 kg  36   2.7946  3.0538 to 2.5575  -0.3497  0.1751 to -0.8745   0.9351  G, E
L H vs. BM          < 20 kg   135  2.4386  2.5604 to 2.3225  -1.1878  -0.9858 to -1.3898  0.9192  0
                    > 20 kg   52   2.9807  3.2978 to 2.6940  -2.0078  -1.2791 to -2.7365  0.8728  G
                    < 50 kg   149  2.5866  2.7051 to 2.4734  -1.4091  -1.2063 to -1.6120  0.9245  0
                    > 50 kg   38   3.0525  3.4941 to 2.6667  -2.1794  -1.1501 to -3.2088  0.8392  G
                    < 100 kg  157  2.6465  2.7582 to 2.5394  -1.5013  -1.3060 to -1.6966  0.9321  0
                    > 100 kg  30   2.9405  3.4742 to 2.4888  -1.8798  -0.6326 to -3.1269  0.8127  G
C H vs. BM          < 20 kg   138  2.7768  2.8898 to 2.6683  -0.2550  -0.1255 to -0.3845  0.9447  E
                    > 20 kg   62   2.5793  2.7425 to 2.4258  0.0509   0.3671 to -0.2653   0.9434  E, S
                    < 50 kg   153  2.7188  2.8130 to 2.6277  -0.1941  -0.0793 to -0.3088  0.9551  E
                    > 50 kg   47   2.5612  2.7123 to 2.4184  0.1031   0.4070 to -0.2008   0.9635  E, S
                    < 100 kg  164  2.7253  2.8067 to 2.6463  -0.2005  -0.0972 to -0.3038  0.9640  E
                    > 100 kg  36   2.6488  2.8634 to 2.4504  -0.0887  0.3518 to -0.5293   0.9500  E, S
L F vs. L H         < 20 kg   135  1.0776  1.1143 to 1.0422  -0.2261  -0.1618 to -0.2904  0.9619
                    > 20 kg   52   0.9586  1.0632 to 0.8642  0.0374   0.2839 to -0.2092   0.8666
                    < 50 kg   149  1.0672  1.1041 to 1.0315  -0.2079  -0.1414 to -0.2744  0.9564






> 50 kg 


38 


1.0327 


1.1337 to 0.9407 


-0.1509 


0.0956 to -0.3973 


0.9236 






< 1 00 kg 


157 


1.0613 


1.0945 to 1.0291 


-0.1983 


-0.1374 to -0.2592 


0.9624 






> 1 00 kg 


30 


1.0371 


1.1743 to 0.9160 


-0.1629 


0.1727 to -0.4984 


0.8965 




C H+F vs. BM 


< 20 kg 


138 


2.9032 


2.9989 to 2.8105 


-1.3628 


-1.2223 to -1.5032 


0.9634 






> 20 kg 


62 


2.7519 


2.8828 to 2.6269 


-1.1186 


-0.8251 to -1.4120 


0.9674 






< 50 kg 


153 


2.8383 


2.9165 to 2.7622 


-1.2743 


-1.1542 to -1.3945 


0.9714 






> 50 kg 


47 


2.6819 


2.8173 to 2.5530 


-0.9409 


-0.6286 to -1.2531 


0.9731 






< 1 00 kg 


164 


2.8409 


2.9084 to 2.7750 


-1.2778 


-1.1709 to 1.3846 


0.9771 






> 1 00 kg 


36 


2.7442 


2.9343 to 2.5663 


-1.0954 


-0.6491 to -1.5416 


0.9630 





Standardized major axis equation shown in the format y = mx + b. The theoretical scaling model (Sim.) followed by the slope is represented by G
(geometric similarity), E (elastic similarity), or S (static similarity). Scaling patterns that fall between models are represented by > or <, and those
that do not follow any pattern (that is, above or below all predicted models) are represented by a 0. BM, body mass; C_F, femoral circumference;
C_H, humeral circumference; C_H+F, total humeral and femoral circumference; L_F, femoral length; L_H, humeral length.
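
The Sim. labels above can be reproduced from the slope confidence intervals. The following Python sketch is illustrative only. The theoretical slopes are assumptions taken from the standard formulations of geometric, elastic, and static similarity (3, 8/3, and 2.5 when log body mass is regressed on log circumference; 3, 4, and 5 when it is regressed on log length), and the names `classify`, `CIRC_VS_MASS`, and `LEN_VS_MASS` are hypothetical.

```python
# Theoretical slopes for log10(BM) regressed on log10(circumference) or
# log10(length). ASSUMPTION: values follow the standard similarity models,
# in which diameter scales with mass^(1/3), mass^(3/8) and mass^(2/5), and
# length with mass^(1/3), mass^(1/4) and mass^(1/5).
CIRC_VS_MASS = {"G": 3.0, "E": 8 / 3, "S": 2.5}
LEN_VS_MASS = {"G": 3.0, "E": 4.0, "S": 5.0}


def classify(ci_low, ci_high, models=CIRC_VS_MASS):
    """Return a Sim.-style label for a slope with 95% CI [ci_low, ci_high]."""
    # Models whose predicted slope lies inside the confidence interval.
    inside = [name for name, slope in models.items() if ci_low <= slope <= ci_high]
    if inside:
        return ", ".join(inside)

    lowest, highest = min(models.values()), max(models.values())
    if ci_high < lowest or ci_low > highest:
        return "0"  # above or below every theoretical model

    # Otherwise the interval sits in a gap between two neighbouring models.
    below = max((slope, name) for name, slope in models.items() if slope < ci_low)
    above = min((slope, name) for name, slope in models.items() if slope > ci_high)
    signs = {below[1]: ">", above[1]: "<"}
    return ", ".join(f"{signs[name]} {name}" for name in models if name in signs)


# Femoral circumference vs. body mass, mammals under 100 kg (interval from the table).
print(classify(2.8312, 2.9945))
```

The same function applies to the length regressions by passing `LEN_VS_MASS` as the `models` argument.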
Table 5 Slope and intercept comparisons of stylopodial scaling patterns in different mammalian size classes.

Size-class thresholds (columns): 20 kg, 50 kg, 100 kg; each threshold reports mCI, mP, bCI, bP, b'P.
Comparisons (rows): L_F vs C_F; L_H vs C_H; L_F vs BM; C_F vs BM; L_H vs BM; C_H vs BM; L_F vs L_H; C_H+F vs BM.

Standardized major axis equation shown in the format y = mx + b. Symbols: (°) represents differences at 90 to 95% (0.1 > P > 0.05); (*) at 95 to 99%
(0.05 > P > 0.01); and (**) at greater than 99% (P < 0.01). Otherwise, P-values are > 0.1. All P-values are adjusted for multiple comparisons using
FDR. Hyphens (-) represent duplicate comparisons. Significant differences using 95% CI are assessed on whether the intervals overlap or not;
non-overlapping comparisons are indicated with an asterisk (*). mCI, slope comparisons based on 95% confidence intervals; mP, slope comparisons
based on likelihood ratio test; bCI, intercept comparisons based on 95% confidence intervals; bP, intercept comparisons based on two-tailed t-test;
b'P, t-test comparison of adjusted intercepts to the minimum value along the x-axis; BM, body mass; C_F, femoral circumference; C_H, humeral
circumference; C_H+F, total humeral and femoral circumference; FDR, false discovery rate; L_F, femoral length; L_H, humeral length.



follow the model predicted by elastic or static similarity. 
In contrast, the scaling of humeral length to body mass 
is more closely associated with geometric similarity, as 
no clade follows elastic similarity, two clades follow geo- 
metric similarity, and four are negatively allometric (and 
therefore are below any theoretical model). Only two 
groups (Reptilia and Ungulata) are significantly above 
geometric similarity and therefore exhibit an allometric 
pattern whereby the length of the humerus gets shorter 
as body size increases, approaching a more elastic pat- 
tern. A similar pattern is present in the scaling of 
femoral measurements with body mass. These patterns 
suggest that circumference measurements tend towards 
allometric models suggested by McMahon [95,96], 
whereas length measurements follow a pattern that, in 
general, cannot be differentiated from isometry when 
compared to body mass. 

The results presented here reveal that general scaling 
patterns of limb circumference in numerous different ter- 
restrial vertebrates, though not always strictly elastic (as 
defined by McMahon), follow consistent allometric trajec- 
tories. Such allometric relationships indicate that, interspe- 
cifically, as animals get larger their limbs increase in 
robusticity at a higher rate compared to body mass. These 
changes in the architecture of the limb in relation to size 
support the dynamic similarity hypothesis proposed by 
Rubin and Lanyon [80], which predicts changes in limb 
structure in order to maintain safety factors [86]. The 
morphological changes in limb skeletal structure, as sug- 
gested by Rubin and Lanyon [80], are not the only shifts to 
occur with size, and likely work in concert with other 
shifts, such as postural and behavioral [80,84,86,88], to
mitigate the response of safety factors to changes in body
size. It is important to note in this respect that this study 
only examines the external dimensions of the bones, and 
that factors such as posture may influence aspects of 
cross-sectional bone shape (such as the relative propor- 
tions between anteroposterior and mediolateral diameters) 
and internal bone distribution that are not captured here. 
Nevertheless, the highly conserved relationships between 
individual and total humeral and femoral circumference 
and body mass suggest that in terrestrial quadrupeds 
external circumference measurements of the stylopodia 
are largely independent of posture and gait, and are most 
strongly associated with size, allowing us to forward the 
hypothesis that stylopodial circumference is more closely 
associated with the body mass than with the type of force 
(that is, compression or torsion) acting on the limb. Our 
results therefore present regressions that are most suitable 
for body mass estimation of extinct terrestrial quadrupedal 
vertebrates, regardless of the group under consideration. 

Stylopodial scaling as a predictor of body mass 

As body mass is correlated with numerous physiological
and ecological properties (for example, [4,97]), consistent
and accurate estimation of body mass in extinct taxa is 
important when attempting to reconstruct the dynamics 
of paleoecosystems and the life history of extinct taxa. 
The use of skeletal scaling to estimate body mass is com- 
mon in extinct mammals and birds (for example, 
[17,41,42,45,98]); however, it is less common in extinct 
non-avian archosaurs and non-mammalian synapsids 
([48,73,99] being notable exceptions). Scaling methods
are often criticized when models are extended to more
distantly related stem taxa, based on arguments such as
uneven taxon sampling (ungulate bias), their doubtful applicability to
animals of different gaits and limb postures, and their
susceptibility to residual and extreme outliers
[51,70,72,82]. Our dataset allows us to address these
major criticisms with empirical data.

Ungulate uniqueness and bias

Ungulates, and specifically artiodactyls or bovids, are con- 
sidered to exhibit scaling patterns distinct from those seen 
in other mammals. In particular, their limbs are consid- 
ered to follow an elastic trend [70,77,78,93,96,100]. In 
addition to finding elastic trends in other mammalian 
clades and in reptiles, we reject previous interpretations 
that limb scaling in ungulates is strictly elastic. In the sam- 
ple of 41 ungulates examined here (including 34 artiodac- 
tyls of which 20 are bovids), elastic similarity was 
recovered only in humeral circumference compared to 
body mass, a pattern also noted in most other clades 
(Table 1). Scaling of other limb measurements in ungu- 
lates either cannot be differentiated from geometric simi- 
larity, or follows allometric patterns significantly different 
from either theoretical model (Table 1, Sim = 0). These
patterns are robust even when assessed at more exclusive 
levels (artiodactyls or bovids; Additional file 4, Table S3). 
As a result, a strict relationship between stylopodial scaling 
patterns and a cursorial lifestyle does not characterize 
ungulates to the exclusion of other mammalian clades. As 
such, cursorial adaptations in the limbs of ungulates may 
be limited to other stylopodial measurements (for exam- 
ple, diameter) or more distal limb bones [83,93]. 

The different patterns of limb scaling observed in
ungulates compared to other mammals [70,77,78] are often
used to cast doubt on the utility of the Anderson method
to estimate body mass in extinct taxa. Our new data confirm
some differences in limb scaling between ungulates and
other mammalian clades, but only in comparisons of
limb proportions (length to circumference) and length to
body mass (Figure 1; Table 2). Circumference to body
mass relationships reveal very high coefficients of deter- 
mination and recover no significant differences between 
ungulates and other groups of mammals. The combined 
circumference of the stylopodia revealed the strongest 
relationship to body mass (Figure 4A) and shows that a 
bias towards ungulates does not significantly alter the 
relationship; ungulates follow the same scaling relation- 
ships of this variable to body mass as other mammals, as 
well as non-avian reptiles. 

Limb scaling patterns at different gaits and limb postures 

Extant terrestrial vertebrates have a variety of gaits and 
limb postures [79,80]. In vivo strain studies have also 
shown that in mammals, limbs of taxa of smaller body 
size are primarily loaded in tension, whereas compression 
predominates in larger taxa, resulting from postural differences
with size (also related to the dynamic similarity
hypothesis). Such differences are also noted in reptiles
compared to mammals, in which the former hold their
limbs in a sprawling fashion and hence their stylopodia
are generally loaded under tension [79]. Given these postural
differences, it was hypothesized that the scaling pattern
of limb robusticity with body mass should vary in
response to differences in limb loading [84,85]. Comparisons
made here between differently sized mammals, as
well as between mammals and reptiles, reveal significant 
differences in limb proportions, as well as in the relation- 
ships between length and body mass (Figures 2 and 3; 
Tables 2 and 5), and support previous studies [78,85,94]. 
Surprisingly, however, the relationships between limb cir- 
cumference and body mass are conserved between these 
different groups, and no significant differences in circum- 
ferential scaling between differently sized animals and 
between mammals and reptiles were observed. Further- 
more, we find limited evidence for geometric similarity of 
limb robusticity in both small and large size class sam- 
ples. Instead, circumference measurements follow a gen- 
erally negative allometric pattern indicating a consistent 
increase in circumference relative to body size in both 
small and large mammals. The total stylopodial circum- 
ference (Figure 4A) provides the strongest relationship 
(R = 0.9861) and suggests that this variable is a strong 
predictor of body size for both parasagittal and sprawling 
taxa alike, and that combined limb circumference is not 
strongly correlated with limb posture and gait. These 
results concur with other studies on non-avian reptiles 
[84] and birds [101] that have shown remarkable morphological
similarities of limb circumference (or diameter)
between taxa with highly variable limb posture.

Outliers

The final criticism made towards the use of skeletal 
scaling methods, such as the Anderson method, to esti- 
mate body mass is related to the effect outliers have on 
the final predictive equation, especially at large body 
size where the sample size is low [82]. In the relation- 
ship between combined humeral and femoral circumfer- 
ence and body mass, a residual outlier test reveals that 
none of the largest animals in our greatly expanded 
dataset are residual outliers, including the buffalo, hip- 
popotamus, and elephant (Figure 4A). The only outliers 
identified here appear to be related to unique ecologies, 
such as suspension locomotion (Choloepus didactylus) 
and burrowing (Priodontes maximus, Condylura cristata, 
Parascalops breweri), which can generally be inferred 
from skeletal anatomy as a potential confounding factor 
to mass estimation based on their highly derived limb 
morphologies [102]. Both representatives of Soricomor- 
pha, C. cristata and P. breweri, are the farthest residual 
outliers, and, due to their especially apomorphic anat- 
omy, will be removed from the body mass equation. 
Only one residual outlier, the turtle Trachemys scripta, is
difficult to explain, but its relatively high weight may be
a result of captivity or of measurement error when the live
weight was taken.

Figure 4 Raw OLS regression for body mass estimation and percent prediction error of body mass proxies. (A) The least-squares
regression of the raw data between the log total stylopodial circumference and log body mass in a sample of 245 (talpids removed) mammals
and non-avian reptiles. The regression equation is shown in the format y = mx + b, and is presented along with its coefficient of determination (R²),
mean percent prediction error (PPE), standard error of the estimate (SEE), and Akaike Information Criterion (AIC). (B) Comparison of the predictive
power of several body mass proxies based on their mean PPE. The mean PPE of each proxy is represented by the black circle along with its
95% confidence error bars. The plot is divided into two sections representing the results from the bivariate and multiple regression analyses.
Variables regressed against body mass are labelled along the x-axis. Labels marked with an * represent the analyses in which the data were
phylogenetically adjusted through the use of a phylogenetic generalized least squares bivariate or multiple regression. C_F, femoral circumference;
C_H, humeral circumference; L_F, femoral length; L_H, humeral length; OLS, ordinary least squares.

A recent study by Packard et al. [82] suggested that 
because of its amphibious lifestyle, Hippopotamus 
amphibius may have a high body mass compared to its 
limb circumference measurement. As a result, it may 
represent a residual outlier, which justifies the removal 
of H. amphibius from the analysis. This assertion is 
based on the observation that if the raw data (non-log) 
of Anderson et al. [73] is regressed using non-linear 
least-squares regression methods, the hippopotamus, the 
bison, and the elephant are all outliers. The statistical 
merits and flaws of logarithmically transforming data 
have been heavily debated (for example, [81,82,103,104]) 
and will not be discussed further here. However, based 
on the suggestions of Packard et al. [82], we regressed 
our non-log transformed expanded dataset using a non- 
linear least squares regression, implemented with the 
'nls' function in R, and tested for potential outliers in 
the residual variance. The results indicate that 40 spe- 
cies are outliers in the non-log residual data. In order to 
test for potential significant effects, we removed the 40 
outliers and re-ran the log-log ordinary least squares 
(OLS) regression, which resulted in a slope of 2.802 ±
0.055 that is statistically indistinguishable from that
obtained when using the complete dataset. This suggests
that these data points do not significantly affect the final
result. More importantly, examination of the mean percent
prediction error (PPE) indicates that despite the
need for back-transformation, the log-transformed linear
regression is a significantly better model for predicting
body mass than a non-linear model (log PPE = 25% ±
3%; non-log PPE = 43% ± 3%; Figure 4B; two-tailed
t-test: t = -8.3245, P << 0.0001).
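
The comparison above rests on back-transforming log-log predictions and averaging their percent prediction errors. The Python sketch below illustrates that workflow on synthetic data only; it is not the analysis reported here. It assumes the common definition PPE = |observed - predicted| / predicted x 100, and the function names `fit_loglog` and `mean_ppe`, the simulated slope and intercept, and the noise level are all hypothetical.

```python
import math
import random


def fit_loglog(x, y):
    """Ordinary least squares of log10(y) on log10(x); returns (slope, intercept)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    mean_x, mean_y = sum(lx) / len(lx), sum(ly) / len(ly)
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (c - mean_y) for a, c in zip(lx, ly))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_ppe(observed, predicted):
    """Mean percent prediction error (ASSUMED form: |obs - pred| / pred * 100)."""
    errors = [abs(o - p) / p * 100.0 for o, p in zip(observed, predicted)]
    return sum(errors) / len(errors)


# Synthetic circumferences and masses (hypothetical values, not the study dataset).
random.seed(0)
circ = [10 ** random.uniform(1.0, 2.8) for _ in range(200)]
mass = [10 ** (2.75 * math.log10(c) - 1.1 + random.gauss(0.0, 0.1)) for c in circ]

m, b = fit_loglog(circ, mass)
# Back-transform the log-scale predictions before computing the error.
pred = [10 ** (m * math.log10(c) + b) for c in circ]
ppe = mean_ppe(mass, pred)
print(f"slope = {m:.3f}, intercept = {b:.3f}, mean PPE = {ppe:.1f}%")
```

Because the error is computed on back-transformed values, the same `mean_ppe` function could be applied unchanged to predictions from a non-linear fit, which is what makes the two models comparable.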

Extreme outliers, those at the upper and lower extremes 
of the dataset, also have the potential to significantly affect 
regression results. In the current dataset, there are no 
extreme outliers when the data is log transformed. How- 
ever, as is generally the case with extant size data, there are 
several positive extreme outliers in the non-log dataset. 
Thirty-three extreme outliers are observed in the body 
mass and combined humeral and femoral circumference 
data. When these taxa are removed and the log-log analysis 
is re-run (m = 2.745 ± 0.057, b = -1.099 ± 0.09), the regres- 
sion is virtually identical to that obtained with the total 
dataset. The observation that extreme positive values do 
not affect the log-log OLS regression is further supported 
by the non-significant variation in scaling coefficients 
between different mammalian size classes (Figure 3).

The empirical data presented here falsifies the main criticisms
forwarded against skeletal-body mass regression
models for predicting body mass in extinct taxa, and given 
the highly conserved nature of the relationship between 
stylopodial circumference and body mass in extant terres- 
trial mammals and reptiles, suggests that circumference 
measurements represent robust proxies of body mass that 
can be applied to extinct, phylogenetically and morpholo- 
gically disparate quadrupedal terrestrial amniotes. The 
examination of eight terrestrial lissamphibian species (one 
caudatan and seven anurans [Additional file 1 Dataset]; 
not included in the final analysis) reveals that, based on 
their total stylopodial circumference and body mass, they 
plot within the range of variation present in the mamma- 
lian and reptilian dataset (Figure 2). Although at this time 
their small sample and range preclude any meaningful sta- 
tistical comparisons between the limb scaling patterns of 
lissamphibians and other tetrapods, these preliminary 
results suggest that the conserved relationship between 
body mass and proximal limb bone circumference could 
be extended to encompass the majority of quadrupedal 
terrestrial tetrapods. 

Implications for body mass estimation 

In extinct taxa, skeletal measurement proxies of body 
size are often preferred to actual body mass estimates. 
Of the limb measurements taken here, results suggest 
that the regression between the total circumference of 
the humerus and femur to body mass exhibits the strongest
relationship, with the highest R² values, and the
lowest PPE, standard error of the estimate (SEE), and 
Akaike Information Criterion (AIC) values of all bivari- 
ate regression models (Figure 4B; Additional file 5, 
Table S4). Among commonly cited proxies of size is 
femur length (for example, [15]). However, our analyses 
indicate that length measurements are generally poor 
indicators of size, especially compared to circumference 
(Figure 4B). Femur length exhibits an especially high 
amount of error, with a 70% mean PPE in living mammals
and reptiles, compared to the 25% for the combined
humeral and femoral circumference. Caution
should therefore be taken when using limb lengths as
size proxies, especially when examining taxa that
encompass a wide phylogenetic bracket.

Based on our results, we propose the following scaling 
equation as a robust predictor of body mass in quadru- 
pedal tetrapods: 

$$\log \mathrm{BM} = 2.749 \cdot \log C_{H+F} - 1.104 \qquad (1)$$

where $C_{H+F}$ is the sum of the humeral and femoral
circumferences. This regression
exhibits a very high coefficient of determination
(R² = 0.988), and a mean PPE of 25.6%. When adjusted
for phylogenetic correlation/covariance between observations
(that is, species) using a phylogenetic generalized
least squares model, the equation is:

$$\log \mathrm{BM} = 2.754 \cdot \log C_{H+F} - 1.097 \qquad (2)$$

which has an almost identical mean PPE (25%) to
equation 1 (Figure 4B).
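
As a worked illustration, Equations 1 and 2 can be applied with a few lines of Python. This is a minimal sketch under stated assumptions: logarithms are base 10, circumference is taken to be in millimetres and body mass in grams (the units are not restated alongside the equations), and the function names `body_mass` and `ppe_range` are hypothetical. The error range simply applies the 25% mean PPE symmetrically, as in the ranges reported in Table 6.

```python
import math

# (slope, intercept) as printed in Equations 1 and 2.
EQ1_RAW = (2.749, -1.104)    # ordinary least squares on the raw data
EQ2_PGLS = (2.754, -1.097)   # phylogenetic generalized least squares


def body_mass(c_hf, coeffs=EQ2_PGLS):
    """Estimate body mass from combined humeral + femoral circumference.

    ASSUMED units: c_hf in millimetres, result in grams.
    """
    slope, intercept = coeffs
    return 10 ** (slope * math.log10(c_hf) + intercept)


def ppe_range(estimate, ppe=0.25):
    """Lower and upper bounds implied by a mean percent prediction error."""
    return estimate * (1 - ppe), estimate * (1 + ppe)


# Hypothetical specimen with a combined circumference of 100 mm.
estimate = body_mass(100.0)
low, high = ppe_range(estimate)
print(f"{estimate:.0f} g (range {low:.0f} to {high:.0f} g)")
```

Passing `EQ1_RAW` as `coeffs` gives the non-phylogenetic estimate for comparison.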

In addition to examining bivariate estimates of body 
mass, we tested the predictive power of a variety of esti- 
mations based on multiple regressions by comparing 
their PPE, SEE, and AIC with those obtained from the 
bivariate regression of total circumference with body 
mass. Analyses including all proximal limb bone measurements
also yield low PPE, SEE, and AIC values for both the
raw data:

$$\log \mathrm{BM} = 0.375 \cdot \log L_H + 1.544 \cdot \log C_H - 0.136 \cdot \log L_F + 0.954 \cdot \log C_F - 0.351 \qquad (3)$$

and the phylogenetically corrected data:

$$\log \mathrm{BM} = 0.212 \cdot \log L_H + 1.347 \cdot \log C_H - 0.533 \cdot \log L_F + 0.749 \cdot \log C_F - 0.76 \qquad (4)$$

Equally low regression statistics were obtained for the
multiple regression including only the circumference
measurements, for the raw data:

$$\log \mathrm{BM} = 1.78 \cdot \log C_H + 0.939 \cdot \log C_F - 0.215 \qquad (5)$$

and for the phylogenetically corrected data:

$$\log \mathrm{BM} = 1.54 \cdot \log C_H + 1.195 \cdot \log C_F - 0.234 \qquad (6)$$

None of the equations presented above are signifi- 
cantly better at predicting body mass than the combined 
humeral and femoral circumference (Equations 1 and 2); 
therefore, any of these equations are likely to provide 
robust estimates of body mass (Figure 4B). However, 
given that equations 2, 4, and 6 account for phylogenetic
non-independence, they are likely to represent the
statistical error in the data better than the
uncorrected equations 1, 3, and 5.
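
The multiple regressions (Equations 3 to 6) share one functional form, so they can be evaluated with a single routine. The Python sketch below is an illustration under stated assumptions: logarithms are base 10, lengths and circumferences are taken to be in millimetres and body mass in grams, and the coefficient tuples and the function name `log_body_mass` are hypothetical. Equations 5 and 6 are encoded with zero weight on the two length terms.

```python
import math

# Coefficients on (log L_H, log C_H, log L_F, log C_F) plus the intercept,
# as printed in Equations 3 to 6.
EQ3_RAW = (0.375, 1.544, -0.136, 0.954, -0.351)
EQ4_PGLS = (0.212, 1.347, -0.533, 0.749, -0.76)
EQ5_RAW = (0.0, 1.78, 0.0, 0.939, -0.215)    # circumferences only
EQ6_PGLS = (0.0, 1.54, 0.0, 1.195, -0.234)   # circumferences only


def log_body_mass(l_h, c_h, l_f, c_f, coeffs):
    """Return log10(body mass) from the four stylopodial measurements.

    ASSUMED units: measurements in millimetres, body mass in grams.
    """
    a_lh, a_ch, a_lf, a_cf, intercept = coeffs
    return (a_lh * math.log10(l_h) + a_ch * math.log10(c_h)
            + a_lf * math.log10(l_f) + a_cf * math.log10(c_f) + intercept)


# Hypothetical specimen: humerus 120 mm long, 40 mm around; femur 150 mm long, 45 mm around.
for label, coeffs in [("Eq. 3", EQ3_RAW), ("Eq. 4", EQ4_PGLS),
                      ("Eq. 5", EQ5_RAW), ("Eq. 6", EQ6_PGLS)]:
    mass_g = 10 ** log_body_mass(120.0, 40.0, 150.0, 45.0, coeffs)
    print(f"{label}: {mass_g:.0f} g")
```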

Not surprisingly, the masses estimated for several com- 
monly cited non-avian dinosaurs provided by Equation 2 
are more consistent with estimates generated from Ander- 
son et al. [73] than volumetric model-based estimates for 
the same taxa (Table 6). This technique is also important 
in that it is specimen-based, and therefore explicit and 
repeatable, and allows uncertainty to be expressed in the 
estimate. These predicted masses and prediction error 
ranges, when compared to previous estimates based on 
volumetric reconstructions [49,51,71], show that many 
reconstructed models underestimate body mass, some- 
times significantly below that predicted by the mean PPE 
(Table 6). Given that life-reconstructions of extinct taxa 
are important for addressing several biological questions, 
including locomotion and weight distribution, our results
provide the first objective framework with which to constrain
these models and test whether their assumptions
conform to the patterns seen in extant terrestrial
tetrapods.

Table 6 Body mass estimates of some commonly cited non-avian quadrupedal dinosaurs.

Taxon                      Sp #            C1962        A1985§  P1997        H1999   S2001  This study (range)
Iguanodon bernissartensis  IRSNB R51       4510         7204    3200         3790    3776   8680 (6510-10850)
Corythosaurus casuarius    ROM 845         3820         3030    2800                 3079   3620 (2720-4530)
Protoceratops andrewsi     MPC-D 100/504   177          68      164                  23.7   79 (59-98)
Styracosaurus albertensis  [illegible]     [illegible]          [illegible]                 [illegible] (3280-5460)
Triceratops horridus       NSM PV 20379    8480         5310    6400         3938    4964   7400 (5550-9250)
Stegosaurus mjosi          SMA 0018*       1780         4131    2200         2530    2611   4950 (3720-6190)
Diplodocus longus          USNM 10865*     10560        9061    11400        13421   19655  10940 (8200-13670)
Brachiosaurus brancai      HMN SII†        78260        29336   31500        25789‖  28655  35780 (26840-44730)

Body masses estimated in this study are based on the phylogenetically corrected total stylopodial circumference equation (Equation 2) and the error range is
based on the 25% mean prediction error obtained from the equation. References: A1985, Anderson et al. [73]; C1962, Colbert [46]; H1999, Henderson [51]; P1997,
Paul [71]; S2001, Seebacher [49]. Museum abbreviations in dataset file [See Additional File 1 Dataset]. * - limb measurements based on a cast mounted at the
Senckenberg Museum, Frankfurt, Germany; † - measurements taken from Anderson et al. [73]; ‡ - measurements from Redelstorff and Sander [145]; § - all
estimates presented under A1985 are based on the equations presented in that study, but use the limb measurements presented in dataset SI; the only
exception is B. brancai, which is based on data from A1985; ‖ - estimate from Henderson [63].

Conclusions 

Body size is an important biological descriptor, and as a 
result, is critical to understanding the paleobiology of 
extinct organisms and ecosystems. This study presents an 
extensive dataset of extant quadrupedal terrestrial 
amniotes, which allows testing of the main criticisms that 
have been put forth against the use of scaling relation- 
ships to estimate body mass in extinct taxa. Our results 
demonstrate a highly conserved relationship between 
body mass and stylopodial circumference with minimal 
variation between clades and groups of different gait and 
size, compared over a large phylogenetic scope. This gen- 
eral relationship allows the estimation of body mass in 
extinct quadrupedal groups, and is particularly important 
for a wide range of paleobiological studies, including 
growth rates [31], metabolism [36], and energetics [105],
as well as for quantifying body size changes across major 
evolutionary transitions that are accompanied by major 
changes in gait, including shifts in the early evolutionary 
history of archosaurs [106], and in the evolution of mam- 
mals from reptile-like basal synapsids [107,108]. 

Methods 

Database construction 

In order to test the hypotheses outlined in the introduction,
we amassed an extensive dataset of limb bone measurements
of 200 mammal and 47 non-avian reptile
species from individuals that were weighed on a scale
prior to either death or skeletonization; no extant body
masses were estimated. For the most part, the dataset
was built with newly measured specimens; however, it 
was augmented with published measurements from 
Christiansen and Harris [109] and Anderson et al. [73] 
[See Additional file 1, Dataset]. Measurements were 
taken from stylopodial elements, including maximum 
lengths and minimum circumference. Length measurements
of less than 150 mm were taken with digital callipers,
those between 150 and 300 mm with longer dial callipers,
and those greater than 300 mm with fiberglass measuring
tape. Following the Anderson
method, we use minimum circumference (thinnest region 
along the diaphysis) as a proxy for limb robusticity. In 
addition to reproducing the analysis presented by Ander- 
son et al. [73], minimum circumference should provide a 
proxy of the minimum cross-sectional area of the bone 
and therefore be related to the overall compressive 
strength of the limb. Cross-sectional area was not used 
due to the cost of collecting this data. Moreover, circum- 
ference can be more easily measured on both extant and 
fossil samples, providing a larger extant dataset and a 
more inclusive framework for future predictive studies. 
Circumference measurements were taken with thin paper 
measuring tapes of different widths, depending on the 
size of the specimen being measured. All measurements 
were taken from both sides of the specimen, where possi- 
ble, and averaged. Specimens measured are of adult body 
size. For most of the mammalian sample, the ontogenetic 
status of the specimen was determined based on the level 
of epiphyseal fusion. For the non-avian reptile sample, as
well as some of the largest mammals, maturity was established
by verifying that the body mass of the measured
specimen is similar to published reports of average body 
masses for that species (for example, [84,110-112]). In 
general, only a single specimen of each species could be 
obtained; however, in instances where more than one 
adult individual was available, the largest individual was 
used in this study. In these cases, none of the exemplars 
used seem unusually large compared to the reported 
adult body mass in that species. Finally, this study com- 
pares taxa with different growth strategies (mammals 
have determinate growth whereas growth in reptiles is 
generally considered indeterminate, but asymptotic 
[113]) that may result in differences in size structuring 
within and between populations of taxa with these differ- 
ent strategies. If, and/or how, these differences affect 
limb to body mass scaling analyses is unknown at this 
time. However, the masses of the reptiles used here fall 
within the range of what is considered typical for an 
adult of each species, and, given our large sample and the 
nature of our results (see below), we expect that these 
effects will be minimal, yet may warrant future 
consideration.

Taxon sampling

Taxa were chosen based on three criteria: 1) The dataset 
must include a large range in body mass, so that size- 
related postural differences can be assessed [83,114]. We 
significantly expand upon the dataset of Anderson et al. 
[73], especially for large-bodied mammalian species, to
better represent the range of variation in limb proportions 
at large sizes and address the contention that certain large 
taxa are residual outliers [82]. Due to the limitations of 
measuring limb bone circumference, taxa below 50 g were 
not included in this study. 2) The sample must encompass 
a wide phylogenetic scope, so that most major mammalian 
and reptilian clades are sampled. 3) The sample must 
include taxa from a broad spectrum of lifestyles. Our 
study focuses on terrestrial taxa; however, we have also 
included mammalian or reptilian taxa with specialized life- 
styles that have the potential to affect limb proportions 
and their relationship with body size. These include saltators
(Macropodidae), brachiators (Hylobates lar and
Pongo pygmaeus), burrowers (for example, Talpidae), and
amphibious taxa (Hippopotamidae and Crocodylia). The
first three categories are associated with salient morphological
features that allow these lifestyles to be recognized
in the fossil record; however, the amphibious nature
of several extinct taxa remains uncertain, and may affect 
how limb measurements scale with body mass due to the 
effects of buoyancy. 

Avian taxa were not included in the current study 
because they are bipedal. The forces exerted by body 
mass in a biped are transmitted through two limbs 
compared to four in a quadruped, and therefore direct 
comparisons of limb to body mass scaling between birds 
and quadrupedal tetrapods are difficult to interpret. A 
small sample of lissamphibians (one caudatan and seven 
anurans) for which live body mass is known was exam- 
ined in this study. Unfortunately, the current sample 
size does not provide enough power to make meaningful 
slope and intercept comparisons, so lissamphibians are 
not included in the main comparisons presented in the 
results section. 

Statistical analyses 

The distributions of the variables used in this study are 
all positively skewed and, therefore, depart strongly 
from a normal distribution; as such, all variables were 
logarithmically transformed (at base 10) so that they 
approximate a normal distribution. In addition to improving 
normality, log transformation reduces the level of heteroscedasticity in 
the data set, minimizes the effect of extreme outliers, 
and allows for the visualization of data in a linear fash- 
ion, which simplifies the visual comparisons of slopes 
[81,115]. The benefits and complications regarding the 
application of log transformation in predictive scaling 
relationships were recently debated by Packard et al. 
[82] and Cawley and Janacek [104]. We agree with the 
latter study, which demonstrated that log-transformed 
data are preferred for this type of analysis because the 
transformation assigns equal weight to all data points in a 
regression, rather than emphasizing the upper extreme values, 
and, furthermore, the residuals are not significantly related to size [104]. 
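
The effect of the transformation can be illustrated with a minimal Python sketch. The masses below are hypothetical values chosen only to span several orders of magnitude (they are not data from this study), and the moment-based skewness statistic is an assumption made for illustration rather than a test used here.

```python
import math

def log10_transform(values):
    """Return base-10 logarithms of strictly positive measurements."""
    if any(v <= 0 for v in values):
        raise ValueError("log transformation requires strictly positive values")
    return [math.log10(v) for v in values]

def skewness(values):
    """Moment skewness: third central moment divided by variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical body masses in grams (illustrative only, not the study's data).
masses_g = [60, 250, 1200, 9000, 45000, 300000, 2500000]

raw_skew = skewness(masses_g)                   # strongly positive
log_skew = skewness(log10_transform(masses_g))  # much closer to zero
```

In this toy example the raw masses are dominated by the single largest value, whereas the log-transformed values are spread far more evenly, which is the behavior the transformation is intended to produce.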
Interspecific limb scaling 

All measurements were incorporated into a variety of 
bivariate plots and analyzed using the SMA line-fitting 
method (also known as Reduced Major Axis) [116]. The 
analyses compare a variety of measurements, including: 

1) limb proportions, such as femur length to humerus 
length and humerus/femur length to circumference; and 

2) limb measurements to body mass, such as humerus/ 
femur length versus body mass and humerus/femur cir- 
cumference versus body mass. All SMA analyses were 
conducted using the open-source software R [117] and 
the package 'smatr' [116,118]. 
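
As an illustration of the line-fitting step, the SMA slope can be computed as the ratio of the standard deviations of y and x, signed by their covariance, with the line passing through the bivariate mean. The Python sketch below assumes this standard definition; the function name and the sample values are hypothetical, and the analyses in this study were run in R with 'smatr', not with this code.

```python
import math

def sma_fit(x, y):
    """Standardized (reduced) major axis fit of y on x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Slope magnitude is sd(y)/sd(x); its sign follows the covariance.
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical log10 femur circumference (mm) and log10 body mass (g).
log_circ = [1.0, 1.3, 1.7, 2.1, 2.6]
log_mass = [2.1, 2.9, 4.0, 5.0, 6.4]
slope, intercept, r_squared = sma_fit(log_circ, log_mass)
```

Unlike an OLS slope, the SMA slope does not shrink as the correlation weakens, which is why it is commonly preferred when the aim is to describe how two variables scale with each other rather than to predict one from the other.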

To address the criticisms raised against the Anderson 
method, subgroups within the data were compared. 
These include comparisons between mammalian clades 
for which a sample size greater than ten could be 
obtained, such as Ungulata, Carnivora, Marsupialia, 
Euarchonta, and Glires. In addition, comparisons were 
made between different size classes. Size class compari- 
sons were based on three body mass thresholds: 20 kg, 
which was previously used by Economos [94] to show 
differential scaling in mammals and is also thought to 
represent the lower size limit for migratory mammals, 
and hence may affect limb scaling patterns [4]; 50 kg, a 
threshold at which mammalian limb scaling has been 
previously noted to vary [93]; and 100 kg, previously 
used by Bertram and Biewener [78], and which allows 
better representation of the large-bodied portion of the 
dataset. 

Fitted lines of different subsamples were compared 
based on the 95% confidence intervals of the slope and 
intercept, and differences were considered to be signifi- 
cant when intervals did not overlap. However, given that 
statistical significance can still be obtained even though 
confidence intervals overlap [81], we conducted a series 
of pairwise comparisons of the slopes and intercepts 
using a likelihood ratio test and a t-test, respectively. 
These tests have the added benefit that they can be cor- 
rected for errors associated with multiple comparisons 
using the FDR, an approach that, as far as we are aware, 
cannot be applied to confidence intervals [119,120]. The 
likelihood ratio test was implemented with the 'smatr' 
package [116,118]. Conventional methods for comparing 
intercepts (for example, ANCOVA, Wald statistic, and 
traditional t-tests) alter the original intercepts by forcing 
a common slope to each group being analyzed 
[115,116]. Although this may make statistical sense 
[116], it involves shifting the best-fit line away from 
the original biological data. As a result, here we com- 
pare intercepts using a two-tailed t-test based on equa- 
tion 18.25 of Zar [115]: 

$$t = \frac{b_1 - b_2}{SE_{SMA}}$$

where $b_1$ and $b_2$ represent the pair of intercepts being 
compared, and $SE_{SMA}$ is the standard error of the difference 
in SMA intercepts, calculated as per equation 18.26 
of Zar [115]. Comparing intercepts using this method has 
the added benefit of allowing comparisons of y-values 
along the true SMA lines at x-values other than 0. This is 
advisable when comparing biological scaling lines 
because first, the intercept at x = 0 is an extrapolation of 
the line beyond the range of the data [115], but perhaps 
more importantly given the type of data used here, a 
value of x = 0 is biologically meaningless. As a result, in 
addition to presenting the results of the t-test at the true 
intercept, we compare y-values at the minimum value of 
the total dataset along the x-axis using the same t-test 
method. The results of the two intercept comparison 
methods described above are presented, and all P-values 
are corrected using the FDR [119,120], implemented with 
the 'p.adjust' function in R. In total, 14 pairwise 
comparisons are made for each analysis. 
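
The two steps described above (comparing fitted lines at a chosen x-value, then correcting the resulting P-values) can be sketched in Python. This is a minimal illustration under stated assumptions: the standard error of the difference is treated as a supplied input rather than computed from Zar's equation 18.26, the function names are hypothetical, and the correction shown is the Benjamini-Hochberg procedure, which is what R's 'p.adjust' applies for method "fdr".

```python
def t_at_x(slope1, intercept1, slope2, intercept2, x0, se_diff):
    """t statistic for the difference between two fitted lines at x = x0.

    With x0 = 0 this compares the true intercepts; with x0 set to the
    minimum x of the dataset it compares y-values within the data range.
    `se_diff` is the standard error of the difference (assumed given).
    """
    y1 = intercept1 + slope1 * x0
    y2 = intercept2 + slope2 * x0
    return (y1 - y2) / se_diff

def fdr_adjust(p_values):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(p_values)
    # Visit P-values from largest to smallest, keeping a running minimum.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # rank of this P-value in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw P-values from four pairwise comparisons.
adjusted = fdr_adjust([0.01, 0.04, 0.03, 0.005])
```

A comparison would then be treated as significant when its adjusted P-value falls below the chosen threshold.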

In addition to comparing limb scaling patterns between 
different groups, scaling coefficients were used to test the- 
oretical scaling models, such as geometric (GS), elastic 
(ES), and static (SS) similarity [95,96]. The models predict 
that under GS: circumference $\propto$ length; mass $\propto$ length$^{3}$; 
mass $\propto$ circumference$^{3}$; under ES: circumference $\propto$ 
length$^{1.5}$; mass $\propto$ length$^{4}$; mass $\propto$ circumference$^{8/3}$; and 
finally under SS: circumference $\propto$ length$^{2}$; mass $\propto$ length$^{5}$; 
mass $\propto$ circumference$^{2.5}$. These models were tested against 
the empirical slopes obtained in this study using the 
method described by Warton et al. [116]. 
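
The three sets of predicted exponents are internally consistent, which can be checked with a short Python sketch. It assumes that a limb bone is approximated as a cylinder of constant density, so that mass scales as length multiplied by circumference squared; the function name is hypothetical, and the sketch does not implement the slope test of Warton et al. [116].

```python
from fractions import Fraction

def predicted_exponents(circ_vs_length):
    """Derive mass exponents from the circumference-to-length exponent.

    Assumes mass is proportional to length * circumference**2.
    """
    a = Fraction(circ_vs_length)
    mass_vs_length = 1 + 2 * a          # length * (length**a)**2
    mass_vs_circ = mass_vs_length / a   # re-express length in terms of circumference
    return {"mass~length": mass_vs_length, "mass~circumference": mass_vs_circ}

geometric = predicted_exponents(1)            # 3 and 3
elastic = predicted_exponents(Fraction(3, 2)) # 4 and 8/3
static = predicted_exponents(2)               # 5 and 5/2
```

An empirical slope whose 95% confidence interval excludes all of these values would correspond to a pattern that follows none of the models.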
Phylogenetic independent contrasts 

In addition to plotting the raw data, as was done by Ander- 
son et al. [73], we calculated the phylogenetic independent 
contrasts (PIC) for the entire dataset in order to correct for 
non-independence of the raw data as a result of common 
ancestry [121]. We compared the scaling coefficients from 
the raw and phylogenetically corrected data to test if non- 
independence significantly alters the scaling patterns 
obtained from the raw data. The phylogenetic tree [See 
Additional file 6, Figure S1] was constructed in Mesquite 
[122], based on recent phylogenetic analyses obtained for 
extant Mammalia [123], and non-avian reptiles [124-130]. 
Branch lengths are measured in millions of years. For the 
mammalian portion of the phylogeny we used the branch 
lengths of Bininda-Emonds et al. [123]. Branch lengths in 
the reptile portion of the tree were largely calculated using 
molecular estimates of divergence times [131-138]. How- 
ever, species-level divergence times of some taxa, such as 
turtles, are poorly constrained, and as a result, we esti- 
mated the branch lengths based on the oldest known fossil 
occurrence for the species or genus obtained from the 
Paleobiology Database (http://paleodb.org/). 

Both theoretical and empirical studies of PIC state that 
in order for contrasts to receive equal weighting and 
thereby conform to the assumptions stipulated by para- 
metric analyses and statistics, branch lengths must be 
adjusted so that contrasts are standardized, and therefore 
have a non-significant relationship with their standard 
deviation [139]. This criterion was not met by the raw 
branch lengths, but was satisfied after transforming the 
branch lengths by their natural log. Branch lengths were 
assigned and transformed in Mesquite and the tree file 
was imported into R, where contrasts were calculated 
using the 'APE' package [140]. A best-fit line was 
calculated for the contrasts using an SMA in the package 
'smatr' [116], which allows for the line to pass through 
the origin, as stipulated by Garland et al. [139]. The PIC 
slopes for the entire dataset and subsets (as described 
above) were compared to slopes obtained from the raw 
data using the 95% confidence intervals. 
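
For readers unfamiliar with the procedure, the calculation of standardized contrasts and of an SMA slope forced through the origin can be sketched in Python. This is a minimal illustration of Felsenstein's algorithm on a hypothetical three-taxon tree with made-up branch lengths and trait values; it is not the 'APE' or 'smatr' implementation used in this study, and the tree encoding is an assumption made for the example.

```python
import math

def independent_contrasts(node, trait, out):
    """Append standardized contrasts for every internal node to `out`.

    A node is (content, branch_length); `content` is a tip name (str)
    or a pair of child nodes. Returns (value, effective branch length).
    """
    content, length = node
    if isinstance(content, str):  # tip: look up its trait value
        return trait[content], length
    left, right = content
    x1, v1 = independent_contrasts(left, trait, out)
    x2, v2 = independent_contrasts(right, trait, out)
    # Contrast standardized by the square root of summed branch lengths.
    out.append((x1 - x2) / math.sqrt(v1 + v2))
    # Ancestral value is a branch-length-weighted mean of the descendants.
    x_anc = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    # Lengthen the ancestral branch to account for estimation uncertainty.
    return x_anc, length + v1 * v2 / (v1 + v2)

def sma_slope_through_origin(cx, cy):
    """SMA slope for contrasts, with the line forced through the origin."""
    sxy = sum(a * b for a, b in zip(cx, cy))
    ratio = sum(b * b for b in cy) / sum(a * a for a in cx)
    return math.copysign(math.sqrt(ratio), sxy)

# Hypothetical tree ((A:1, B:1):1, C:2) and hypothetical log-scale traits.
clade_ab = ((("A", 1.0), ("B", 1.0)), 1.0)
tree = ((clade_ab, ("C", 2.0)), 0.0)
log_circ = {"A": 1.0, "B": 3.0, "C": 6.0}
log_mass = {"A": 2.0, "B": 6.0, "C": 12.0}

cx, cy = [], []
independent_contrasts(tree, log_circ, cx)
independent_contrasts(tree, log_mass, cy)
pic_slope = sma_slope_through_origin(cx, cy)
```

Because the sign of each contrast depends on an arbitrary ordering of the two descendants, the fitted line has no meaningful intercept, which is why it is constrained to pass through the origin.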
Body mass estimation 

In order to provide the best estimation parameter for body 
mass, a Model I (OLS) regression analysis is preferred. It is 
the most appropriate model for estimating a value of y 
based on x, as it accounts for the complete error of the y 
variable that can be explained by the x variable [81,141]. 
The analysis was performed on the entire dataset (N = 
247) between body mass and a variety of limb measure- 
ments in order to test for the best predictor. The 
'goodness of fit' of a predictor was examined based on the 
commonly used coefficient of determination ($R^2$); however, 
this value is considered a poor representation of the 
strength of a regression, due largely to its strong associa- 
tion with sample size [103]. Therefore, given the large 
dataset presented here, we provide three additional 
metrics, including the SEE, the PPE, and the AIC. The 
mean PPE is perhaps the best metric of regression 
strength for these types of analyses as it deals with the pre- 
dictive strength of the relationship in relation to the non- 
logged data. In addition, the PPE has the added benefit of 
allowing for calculation of confidence intervals around the 
mean PPE, and therefore facilitates comparison between 
the mean PPE of different models. 
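
The following Python sketch shows how an OLS fit on log-transformed data, the SEE, and the mean PPE relate to one another. It assumes commonly used definitions (SEE as the residual standard deviation with n - 2 degrees of freedom, and PPE as the absolute difference between observed and predicted mass expressed as a percentage of the predicted mass); these definitions, the function names, and the sample values are assumptions for illustration and are not taken from the dataset analyzed here.

```python
import math

def ols_fit(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def standard_error_of_estimate(x, y, slope, intercept):
    """Residual standard deviation with n - 2 degrees of freedom."""
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return math.sqrt(rss / (len(x) - 2))

def percent_prediction_error(observed, predicted):
    """Absolute prediction error as a percentage of the predicted value."""
    return 100.0 * abs(observed - predicted) / predicted

def mean_ppe(log_x, masses, slope, intercept):
    """Mean PPE on the non-logged scale, after back-transforming predictions."""
    errors = [
        percent_prediction_error(m, 10 ** (intercept + slope * lx))
        for lx, m in zip(log_x, masses)
    ]
    return sum(errors) / len(errors)

# Hypothetical log10 circumferences (mm) and live body masses (g).
log_circ = [1.0, 1.5, 2.0, 2.5]
masses_g = [100.0, 3000.0, 110000.0, 3200000.0]
log_mass = [math.log10(m) for m in masses_g]

slope, intercept = ols_fit(log_circ, log_mass)
see = standard_error_of_estimate(log_circ, log_mass, slope, intercept)
ppe = mean_ppe(log_circ, masses_g, slope, intercept)
```

The SEE is expressed in log units, whereas the PPE is computed after back-transformation, which is why the latter is more directly interpretable as predictive accuracy for body mass.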

In addition to the OLS bivariate regression outlined 
above, we included all limb measurements in a suite of 
multiple regression analyses and, given that this techni- 
que is highly recommended [43,47,142], tested if they are 
significantly better predictors of body mass than bivariate 
regressions. The predictive accuracy of each analysis was 
compared using SEE, PPE, and AIC. Finally, because 
none of the bivariate or multiple regressions account for 
correlation and covariance of morphology between taxa 
as a result of phylogenetic history, we re-analyzed the 
data using a phylogenetic generalized least squares 
approach [143], a method recently applied to estimate 
body mass in extinct bovids [144]. Application of this 
method is based on the same phylogenetic tree, branch 
lengths [See Additional file 6, Figure S1], and a Brownian 
motion model of evolution. This approach was 
implemented using the 'APE' and 'nlme' packages in R. 

Additional material 



elastic similarity, or S, static similarity. Scaling patterns that fall between 
models are represented by > or <, and those that do not follow any 
pattern (that is, above or below all predicted models) are represented by 
a 0. 

Additional file 5: Table S4. Predictive power of various body mass 
estimation equations. Bivariate and multiple regression statistics for 
various body mass proxies discussed here (that is, circumference and 
length of the humerus and femur). Statistics include the Percent 
Prediction Error (PPE), along with its upper and lower 95% PPE 
Confidence Intervals (PPE CI), the Standard Error of the Estimate (SEE), 
the Coefficient of Determination ($R^2$), and the Akaike Information 
Criterion Score (AIC). 

Additional file 6: Figure S1. Phylogenetic tree of mammalian and 
reptilian taxa included in this study. Topology is based on multiple 
published analyses mentioned in the text. Numbers indicate the branch 
lengths used in this study, measured in millions of years. Terminal branch 
lengths are most often given next to the species name. 